Received: by cheltenham.cs.arizona.edu; Thu, 12 Nov 1992 16:09:21 MST
Date: 10 Nov 92 13:31:18 GMT
From: destroyer!news.iastate.edu!kelvin@gumby.wisc.edu (Kelvin Don Nilsen)
Organization: Iowa State University, Ames IA
Subject: Re: file scanning
Message-Id: <kelvin.721402278@kickapoo.cs.iastate.edu>
References: <199211092202.AA25256@optima.cs.arizona.edu>
Sender: icon-group-request@cs.arizona.edu
To: icon-group@cs.arizona.edu
Status: R
Errors-To: icon-group-errors@cs.arizona.edu
In <199211092202.AA25256@optima.cs.arizona.edu> "Kenneth Walker" <kwalker@cs.arizona.edu> writes:
>Kelvin Nilsen's Conicon has a stream data type that subsumes files and
>supports a type of scanning. His scanning doesn't have tab() and move();
>instead it has probe() and advance(), which also replace read(). While
>you can backtrack during scanning, I think there are limitations on
>explicitly scanning backwards. He allows for infinite streams along the
>lines of Unix pipes. I know he does some careful buffering to make
>things work.
patterns that require backtracking keep a pointer to the backtracking
point in the suspended scanning expression's activation frame.
this prevents the garbage collector from reclaiming that part of
the stream that might need to be revisited. once control leaves
the bounded expression that holds the suspended scanning expression,
the backtrack point no longer exists, and the garbage collector
is free to reclaim that part of the stream that is known never to be
visited again.
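
A minimal sketch of this idea in Python, not Conicon's actual implementation. The class name PinnedStream and the methods mark(), backtrack(), release() and retained() are hypothetical, and explicit mark/release calls stand in for the pointer held in a suspended expression's activation frame. Text is kept only from the earliest live backtrack point onward; once the last mark is released, everything before the current position can be dropped.

```python
class PinnedStream:
    """Forward-only stream that retains only text a live backtrack point may need."""

    def __init__(self, chunks):
        self._source = iter(chunks)  # stands in for a file or pipe
        self._buf = ""               # text currently retained
        self._base = 0               # absolute offset of self._buf[0]
        self.pos = 0                 # absolute scanning position
        self._marks = []             # absolute offsets of live backtrack points

    def _fill(self, upto):
        # Read forward until the buffer covers absolute offset `upto`.
        while self._base + len(self._buf) < upto:
            chunk = next(self._source, None)
            if chunk is None:
                return False         # stream exhausted
            self._buf += chunk
        return True

    def advance(self, n):
        if n < 0:
            raise ValueError("negative arguments are not allowed")
        if not self._fill(self.pos + n):
            return None
        start = self.pos - self._base
        text = self._buf[start:start + n]
        self.pos += n
        self._reclaim()
        return text

    def mark(self):
        # Record a backtrack point; it pins the stream from here onward.
        self._marks.append(self.pos)
        return self.pos

    def backtrack(self, m):
        if m not in self._marks:
            raise ValueError("backtrack point is no longer live")
        self.pos = m

    def release(self, m):
        # Control has left the bounded expression: the pin goes away.
        self._marks.remove(m)
        self._reclaim()

    def _reclaim(self):
        # Drop everything before the earliest position that can still be visited.
        keep_from = min(self._marks + [self.pos])
        drop = keep_from - self._base
        if drop > 0:
            self._buf = self._buf[drop:]
            self._base = keep_from

    def retained(self):
        return self._buf


s = PinnedStream(["abc", "def", "ghi"])
m = s.mark()
s.advance(5)     # consumes "abcde"; "abcdef" stays buffered because of the mark
s.backtrack(m)   # revisit from the marked position
s.advance(2)
s.release(m)     # now only text from the current position onward is kept
```

In a real run-time system the release would happen implicitly when the activation frame is discarded, and the garbage collector would do the reclaiming.
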
>For files that support seek(), it should be possible to do scanning
>with buffering as you suggested. It means that every file must be
>buffered within the Icon run-time system and all string analysis
>functions have to be changed to work on files and extend the buffer
>when needed. Note that seeking backwards in a file using translation
>mode on systems that have multi-character line termination (like
>MS-DOS) may be a little tricky.
i avoided backward seeks by placing restrictions on the legal argument
values for advance() and probe(). in particular, it was not legal to
specify negative indices.
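
A rough Python sketch of why that restriction helps. It assumes probe(n) returns the next n characters without consuming them and advance(n) consumes them; the real Conicon semantics may differ, and the class name ForwardScanner is hypothetical. With only non-negative arguments, the scanner only ever reads forward from the underlying file and never calls seek().

```python
import io


class ForwardScanner:
    """Scans a text file object using forward reads only (no seek())."""

    def __init__(self, f):
        self._f = f
        self._pending = ""  # characters read from f but not yet consumed

    def _need(self, n):
        if n < 0:
            raise ValueError("negative indices are not legal")
        missing = n - len(self._pending)
        if missing > 0:
            self._pending += self._f.read(missing)  # forward read only
        return len(self._pending) >= n

    def probe(self, n):
        # Look at the next n characters without consuming them.
        return self._pending[:n] if self._need(n) else None

    def advance(self, n):
        # Consume and return the next n characters.
        if not self._need(n):
            return None
        text, self._pending = self._pending[:n], self._pending[n:]
        return text


sc = ForwardScanner(io.StringIO("hello world"))
print(sc.probe(5))    # peek at the next five characters
print(sc.advance(6))  # consume them plus the following space
```

Because nothing here seeks, the same wrapper would work on a pipe, and the multi-character line termination problem with backward seeks does not arise.
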
>There is the question of where you put the buffer. Putting it in the
>string region makes some string handling operations easier; portions
>of the input are normal strings. However, it may be necessary to copy
>the buffer to the end of the region to extend it. How do you decide how
>much to copy? Suppose you backtrack past a point where you did a read-ahead
>to extend the buffer; you probably don't want to lose the read-ahead.
>How do you handle that? I haven't looked closely at Kelvin's work in
>several years, so I don't recall how he does these things. But it doesn't
>sound trivial.
in my work, i was experimenting with language design as well as
implementation techniques. i did not use Icon's traditional garbage
collector. the closest Icon analog to the data structures i used is
a list of strings. in the event that your pattern spans the boundary
between two list elements, the underlying implementation replaces one
of the two neighboring elements with the catenation of the two and removes
the second list element.
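
The list-of-strings analogy can be sketched in Python as follows. This is an illustration only, not the Conicon data structure; the function name match_at and its parameters are hypothetical, and it handles only a literal match. When the text to be examined runs past the end of one element, that element is replaced by the catenation of itself and its neighbor, and the neighbor is removed.

```python
def match_at(chunks, i, offset, literal):
    """Test whether `literal` occurs at chunks[i][offset:].

    If the candidate match would span the boundary between list elements,
    neighboring elements are merged in place until it no longer does.
    """
    while len(chunks[i]) - offset < len(literal) and i + 1 < len(chunks):
        chunks[i] = chunks[i] + chunks[i + 1]  # catenate the two neighbors
        del chunks[i + 1]                      # remove the second element
    return chunks[i].startswith(literal, offset)


chunks = ["abc", "def", "ghi"]
found = match_at(chunks, 0, 2, "cde")  # "cde" straddles the first boundary
print(found, chunks)
```

After a merge the match runs against one ordinary string, so the rest of the matching code does not need to know about element boundaries.
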
in theory, this could be done entirely within Icon, but you would need to
implement all of Icon's built-in scanning functions from scratch. it
would likely run much slower than traditional scanning, but perhaps you
are willing to tolerate the performance problems in return for the increased
expressive power.
Kelvin Nilsen/Dept. of Computer Science/Iowa State University/Ames, IA 50011
(515) 294-2259 kelvin@cs.iastate.edu uunet!kelvin@cs.iastate.edu